Amazon & Microsoft Datacenters Knocked Out by Lightning Storm
On August 7, a lightning storm over Dublin, Ireland, sent the backup power systems of the Amazon Web Services and Microsoft BPOS (Business Productivity Online Suite) datacenters located there haywire.
In the middle of summer in Dublin, an electrical storm knocked out the data systems of the two IT giants. The automatic backup power system failed at both sites, and both had to switch it on manually. Technicians worked for hours to restore everything and are still at it, handing recovered EBS disk data back to customers. Customers receive emails as their data is recovered from snapshots or EBS volumes. Evidently the electrical damage compromised the replication network of the EBS storage system, as happened in April (that time due to human error).
This second incident at Amazon unfortunately makes us reflect on the reliability of its block storage system, Elastic Block Store (EBS). It is delicate and depends on having enough replication bandwidth. When that bandwidth is missing, data consistency is at risk and, in the case of bootable EBS-backed EC2 instances, the service they provide can stop without anyone noticing, because the machine itself still appears to be up. You must have an external alert, since CloudWatch does not monitor the health of the services, or at least not directly.
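To make the idea of an external alert concrete, here is a minimal sketch of a probe that runs from outside the instance and checks the service itself rather than the machine. It is only an illustration under stated assumptions: the health URL, the `alert` callback and both function names are hypothetical, and it uses nothing but the Python standard library.

```python
import urllib.error
import urllib.request


def service_is_healthy(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True only if the service answers the probe with HTTP 200.

    `opener` is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, DNS failures and HTTP error
        # statuses: the instance may be "up" while the service is not.
        return False


def check_and_alert(url, alert, opener=urllib.request.urlopen):
    """Probe the service and call alert(message) when the probe fails.

    `alert` is a hypothetical callback (send an email, page someone, ...).
    """
    healthy = service_is_healthy(url, opener=opener)
    if not healthy:
        alert(f"Service at {url} is not responding, "
              "although the instance may still be running")
    return healthy
```

Run on a schedule from a machine outside the affected datacenter, something like `check_and_alert("http://example.com/health", print)` would report a stalled service even when the instance itself still looks alive.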
It is obvious that a lightning strike is an extraordinary event, but one must also wonder how lightning could have struck electrical equipment, since power plants are usually well equipped with properly sized lightning rods that are supposed to divert the strike. In addition, these are two datacenters belonging to different companies, perhaps close together and fed by the same power line, yet both equipped with the same automatic backup power system, a system that failed in both cases. I believe and hope that both Amazon and Microsoft will be able to take legal action against the manufacturer of the failed device and against the municipality or whoever is responsible for not having brought the lightning protection system up to standard.
Furthermore, it is worth reflecting on how heavily we rely on EBS disks because they are so convenient: fast, non-volatile, resizable, snapshot-capable, and so on. Perhaps we need to plan alternative backups better: synchronization to S3, copies of snapshots in other datacenters, and so on.
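As one possible reading of "copies of snapshots in other datacenters", the sketch below snapshots a volume and then copies the snapshot to a second region. It is an assumption on my part, not something the providers prescribe: the function name is hypothetical, and the two client arguments are expected to behave like boto3 EC2 clients (their `create_snapshot`, `get_waiter("snapshot_completed")` and `copy_snapshot` calls), one per region.

```python
def snapshot_and_copy(source_ec2, dest_ec2, volume_id, source_region):
    """Snapshot an EBS volume, then copy the snapshot to another region.

    source_ec2 / dest_ec2: EC2 clients for the source and destination
    regions (assumed to follow the boto3 EC2 client interface).
    Returns (source_snapshot_id, copied_snapshot_id).
    """
    # 1. Take a snapshot of the volume in the region where it lives.
    snapshot = source_ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"Backup of {volume_id}",
    )
    snapshot_id = snapshot["SnapshotId"]

    # 2. Wait until the snapshot is complete; an unfinished snapshot
    #    cannot be copied.
    waiter = source_ec2.get_waiter("snapshot_completed")
    waiter.wait(SnapshotIds=[snapshot_id])

    # 3. Ask the destination region to pull a copy, so a failure in the
    #    source datacenter does not take the only backup with it.
    copy = dest_ec2.copy_snapshot(
        SourceRegion=source_region,
        SourceSnapshotId=snapshot_id,
        Description=f"Copy of {snapshot_id} from {source_region}",
    )
    return snapshot_id, copy["SnapshotId"]
```

With boto3 installed and credentials configured, the clients could be created with `boto3.client("ec2", region_name="eu-west-1")` for the source and another region name for the destination; the volume ID and the choice of regions are up to you.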